An MPI prototype for compiled communication on Ethernet switched clusters

Authors

  • Amit Karwande
  • Xin Yuan
  • David K. Lowenthal
Abstract

Compiled communication has recently been proposed to improve communication performance for clusters of workstations. The idea of compiled communication is to apply more aggressive optimizations to communications whose information is known at compile time. Existing MPI libraries do not support compiled communication. In this paper, we present an MPI prototype, CC-MPI, that supports compiled communication on Ethernet switched clusters. The unique feature of CC-MPI is that it allows the user to manage network resources such as multicast groups directly and to optimize communications based on the availability of the communication information. CC-MPI optimizes one-to-all, one-to-many, all-to-all, and many-to-many collective communication routines using the compiled communication technique. We describe the techniques used in CC-MPI and report its performance. The results show that communication performance of Ethernet switched clusters can be significantly improved through compiled communication.

Similar resources

Fast Barrier Synchronization on Shared Fast Ethernet

Shared LAN is presently the most widespread networking technology, due to its extremely low cost and favourable cost/performance ratio. Clusters of Personal Computers (PCs) leveraging shared 100base-T Ethernet may currently offer the best price/performance in parallel processing. Most numerical parallel algorithms make heavy use of collective communications and especially barrier synchronization...

Low Overhead Ethernet Communication for Open MPI on Linux Clusters

This paper describes the basic concepts of our solution for improving the performance of Ethernet communication in a Linux cluster environment by introducing Reliable Low Latency Ethernet Sockets. We show that about 25% of the socket latency can be saved by using our simplified protocol. In particular, we emphasize demonstrating that this performance benefit is able to speed up the MPI level co...

Performance and scalability of MPI on PC clusters

The purpose of this paper is to compare the communication performance and scalability of MPI communication routines on an NT cluster, a Myrinet Linux cluster, an Ethernet Linux cluster, a Cray T3E-600, and an SGI Origin 2000. All tests in this paper were run for the various numbers of processors and 2 message sizes. For most of the MPI tests used in this paper, the T3E-600 and Origin ...

Improved GROMACS Scaling on Ethernet Switched Clusters

We investigated the prerequisites for decent scaling of the GROMACS 3.3 molecular dynamics (MD) code [1] on Ethernet Beowulf clusters. The code uses the MPI standard for communication between the processors and scales well on shared memory supercomputers like the IBM p690 (Regatta) and on Linux clusters with a high-bandwidth/low latency network. On Ethernet switched clusters, however, the scali...

BC-MPI: Running an MPI Application on Multiple Clusters with BeesyCluster Connectivity

A new software package, BC-MPI, which allows an MPI application to run on several clusters with various MPI implementations, is presented. It uses vendor MPI implementations for communication inside clusters and exploits the multithreaded MPI_THREAD_MULTIPLE mode for handling inter-cluster communication in additional threads of the MPI application. Furthermore, a BC-MPI application can be automati...

Journal title:
  • J. Parallel Distrib. Comput.

Volume 65  Issue

Pages  -

Publication date 2005